BAYESIAN MODELS FOR DNA MICROARRAY DATA ANALYSIS A Dissertation by KYEONG
Abstract
Bayesian Models for DNA Microarray Data Analysis. (May 2004) Kyeong Eun Lee, B.A., Kyungpook National University, Korea; M.A., Seoul National University, Korea. Co-Chairs of Advisory Committee: Dr. Bani K. Mallick, Dr. James A. Calvin.

Selecting significant genes from their expression patterns is an important problem in microarray analysis. Because the sample size is small and the number of variables (genes) is large, the selection process can be unstable. This research proposes a hierarchical Bayesian model for gene (variable) selection. We employ latent variables in a regression setting and use a Bayesian mixture prior to perform the variable selection. Because the response is binary, the posterior distributions of the parameters are not available in explicit form, so we simulate them with a combination of truncated sampling and Markov chain Monte Carlo (MCMC) techniques. The Bayesian model is flexible enough to identify the significant genes as well as to perform future predictions. The method is applied to cancer classification via cDNA microarrays. In particular, the genes BRCA1 and BRCA2 are associated with a hereditary disposition to breast cancer, and the method is used to identify the set of significant genes to classify BRCA1 and others. Microarray data can also be applied to survival models. We address how to reduce the dimension when building the model by selecting significant genes, and how to assess the estimated survival curves. Additionally, we consider the well-
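The sampling scheme summarized in the abstract (latent variables for a binary response, a mixture prior over which genes enter the model, truncated sampling inside an MCMC loop) can be illustrated with a minimal Gibbs sampler. This is only a sketch under stated assumptions, not the dissertation's actual model: it assumes a probit link, an independent N(0, c) prior on each included coefficient, and a Bernoulli(pi) prior on each inclusion indicator. The function names and the parameters `c`, `pi`, `n_iter`, and `burn_in` are hypothetical.

```python
import numpy as np
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for inverse-CDF truncated draws


def sample_truncated(mean, y, rng):
    """Draw z_i ~ N(mean_i, 1) truncated to z_i > 0 if y_i == 1, else z_i < 0."""
    z = np.empty(len(mean))
    for i, (m, yi) in enumerate(zip(mean, y)):
        p_neg = _STD.cdf(-m)  # probability that the untruncated draw is negative
        u = rng.uniform(p_neg, 1.0) if yi == 1 else rng.uniform(0.0, p_neg)
        u = min(max(u, 1e-12), 1.0 - 1e-12)  # crude numerical guard for inv_cdf
        z[i] = m + _STD.inv_cdf(u)
    return z


def log_marginal(z, Xg, c):
    """log p(z | gamma) up to a constant: z ~ N(0, I + c * Xg Xg^T) (assumed prior)."""
    S = np.eye(len(z)) + c * (Xg @ Xg.T)
    _, logdet = np.linalg.slogdet(S)
    return -0.5 * logdet - 0.5 * z @ np.linalg.solve(S, z)


def gibbs_probit_selection(X, y, n_iter=400, burn_in=100, c=10.0, pi=0.2, seed=0):
    """Return estimated posterior inclusion frequencies for each column of X."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    gamma = np.zeros(p, dtype=bool)           # inclusion indicators
    z = sample_truncated(np.zeros(n), y, rng)  # latent variables
    counts = np.zeros(p)
    for it in range(n_iter):
        # 1) update each indicator with the coefficients integrated out
        for j in range(p):
            logp = np.empty(2)
            for v in (0, 1):
                gamma[j] = bool(v)
                log_prior = np.log(pi) if v else np.log(1.0 - pi)
                logp[v] = log_marginal(z, X[:, gamma], c) + log_prior
            diff = np.clip(logp[0] - logp[1], -700.0, 700.0)
            gamma[j] = rng.random() < 1.0 / (1.0 + np.exp(diff))
        # 2) draw coefficients of the included columns from their Gaussian posterior
        Xg = X[:, gamma]
        k = Xg.shape[1]
        if k > 0:
            A = Xg.T @ Xg + np.eye(k) / c      # posterior precision
            L = np.linalg.cholesky(A)
            beta = np.linalg.solve(A, Xg.T @ z) + np.linalg.solve(L.T, rng.standard_normal(k))
            mu = Xg @ beta
        else:
            mu = np.zeros(n)
        # 3) redraw the latent variables by truncated sampling
        z = sample_truncated(mu, y, rng)
        if it >= burn_in:
            counts += gamma
    return counts / (n_iter - burn_in)
```

Calling `gibbs_probit_selection(X, y)` with an n-by-p expression matrix `X` and a 0/1 label vector `y` returns one inclusion frequency per gene; genes with frequencies near 1 are the ones this toy model treats as significant. Integrating the coefficients out before updating the indicators avoids changing the dimension of the coefficient vector mid-step, which keeps the sketch short.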
Similar resources
Reduced Representations for Efficient Analysis of Genomic Data; from Microarray to High-throughput Sequencing
Abstract of the dissertation: Reduced Representations for Efficient Analysis of Genomic Data; From Microarray to High-throughput Sequencing, by Md Pavel Mahmud. Dissertation Director: Prof. Alexander Schliep. Since the genomics era started in the ’70s, microarray technologies have been extensively used for biological applications such as gene expression profiling, copy number variation (CNV) or Single N...
DNA Microarrays and Gene Expression - From Experiments to Data Analysis and Modeling
Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis
Recognizing genes with distinctive expression levels can help in the prevention, diagnosis and treatment of diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...
Bayesian Melding of Deterministic Models and Kriging for Analysis of Spatially Dependent Data
The link between geographic information systems and decision-making approaches has driven the invention and development of spatial data melding methods. These methods combine different data sets to achieve better results. In this paper, the Bayesian melding method for combining measurements with the outputs of deterministic models and kriging is considered. Then the ozone data in Tehran city are analyze...
Diagnosis of Breast Cancer Subtypes using the Selection of Effective Genes from Microarray Data
Introduction: Early diagnosis of breast cancer and the identification of effective genes are important issues in the treatment and survival of the patients. Gene expression data obtained using DNA microarray in combination with machine learning algorithms can provide new and intelligent methods for diagnosis of breast cancer. Methods: Data on the expression of 9216 genes from 84 patients across...
Publication date: 2004